Distributed Community Detection on Edge-labeled Graphs using Spark

Authors

  • San-Chuan Hung
  • Miguel Araujo
  • Christos Faloutsos
Abstract

How can we detect communities in graphs with edge labels, such as time-evolving networks or edge-colored graphs? Unlike edges in classical graphs, labeled edges carry additional information about the type of connection, e.g., when two people got connected, or which company operates the air route between two cities. We model community detection on edge-labeled graphs as a tensor decomposition problem and propose TeraCom, a distributed system that scales to solve this problem on 10x larger graphs. By carefully designing our algorithm and leveraging the Spark framework, we achieve better accuracy in recovering ground-truth communities than PARAFAC methods, with up to a 30% increase in NMI (normalized mutual information). We also present interesting clusters discovered by our system in a flights network.
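To make the tensor view in the abstract concrete, here is a minimal sketch of how an edge-labeled graph can be stored as a sparse 3-mode tensor (source x destination x label), and how a candidate community can be scored as a dense block of that tensor. This is not TeraCom's algorithm and does not use Spark: the airport codes, the airline names `AirA` and `AirB`, and the helpers `build_tensor` and `block_density` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (source, destination, label) triples.
# The label plays the role of the airline operating a route.
edges = [
    ("PIT", "JFK", "AirA"),
    ("JFK", "PIT", "AirA"),
    ("PIT", "BOS", "AirA"),
    ("BOS", "JFK", "AirA"),
    ("LIS", "OPO", "AirB"),
    ("OPO", "LIS", "AirB"),
    ("LIS", "JFK", "AirB"),
]


def build_tensor(triples):
    """Return a sparse 3-mode tensor as {(i, j, k): count},
    together with the node-name and label-name index maps."""
    node_ids, label_ids = {}, {}
    tensor = defaultdict(int)
    for src, dst, label in triples:
        # setdefault assigns the next free index on first sight.
        i = node_ids.setdefault(src, len(node_ids))
        j = node_ids.setdefault(dst, len(node_ids))
        k = label_ids.setdefault(label, len(label_ids))
        tensor[(i, j, k)] += 1
    return dict(tensor), node_ids, label_ids


def block_density(tensor, nodes, labels):
    """Fraction of cells in the block nodes x nodes x labels
    (self-loops excluded) that contain at least one edge."""
    cells = len(nodes) * (len(nodes) - 1) * len(labels)
    if cells == 0:
        return 0.0
    filled = sum(
        1
        for (i, j, k) in tensor
        if i in nodes and j in nodes and i != j and k in labels
    )
    return filled / cells


tensor, node_ids, label_ids = build_tensor(edges)

# Candidate community: three airports connected by the same airline.
east = {node_ids[n] for n in ("PIT", "JFK", "BOS")}
density_a = block_density(tensor, east, {label_ids["AirA"]})
density_b = block_density(tensor, east, {label_ids["AirB"]})
print(density_a, density_b)
```

In this toy setting, the same set of nodes forms a dense block under one label and an empty block under the other, which is the kind of structure a tensor decomposition can pick up and a plain adjacency matrix cannot.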

Similar resources

Spectral Clustering and Community Detection in Labeled Graphs

We study spectral clustering techniques to learn community structures in labeled random graphs where edge labels from a label set L = {1, ..., L} are drawn according to discrete probability distributions parametrized by community membership of the two end-nodes of the edge. This is a strict generalization of the standard stochastic block model for community detection.

MR-ECOCD: An Edge Clustering Algorithm for Overlapping Community Detection on Large-Scale Networks Using MapReduce

Overlapping community detection is becoming an increasingly important problem in complex networks. Many in-memory overlapping community detection algorithms have been proposed for graphs with thousands of nodes. However, analyzing massive graphs with millions of nodes is infeasible for these traditional algorithms. In this paper, we propose MR-ECOCD, a novel distributed computation algorithm using ...

A note on 3-Prime cordial graphs

Let G be a (p, q) graph. Let f : V(G) → {1, 2, ..., k} be a map. For each edge uv, assign the label gcd(f(u), f(v)). f is called a k-prime cordial labeling of G if |v_f(i) − v_f(j)| ≤ 1 for i, j ∈ {1, 2, ..., k} and |e_f(0) − e_f(1)| ≤ 1, where v_f(x) denotes the number of vertices labeled with x, and e_f(1) and e_f(0) respectively denote the number of edges labeled with 1 and not labeled with 1....

4-Prime cordiality of some classes of graphs

Let G be a (p, q) graph. Let f : V(G) → {1, 2, ..., k} be a map. For each edge uv, assign the label gcd(f(u), f(v)). f is called a k-prime cordial labeling of G if |v_f(i) − v_f(j)| ≤ 1 for i, j ∈ {1, 2, ..., k} and |e_f(0) − e_f(1)| ≤ 1, where v_f(x) denotes the number of vertices labeled with x, and e_f(1) and e_f(0) respectively denote the number of edges labeled with 1 and not labeled with 1....

DFEP: Distributed Funding-Based Edge Partitioning

As graphs become bigger, the need to partition them efficiently becomes more pressing. Most graph partitioning algorithms subdivide the vertex set into partitions of similar size, trying to keep the number of cut edges as small as possible. An alternative approach divides the edge set, with the goal of obtaining more balanced partitions in the presence of high-degree nodes, such as hubs in real wor...

Publication date: 2016